State of Statistical Data Editing and Current Research Problems

Author

  • William E. Winkler
Abstract

1. INTRODUCTION

This paper describes the state of statistical data editing and current research problems as I see them. It is not intended to be a complete description of all areas. Rather, it covers selected sub-areas of statistical data editing in sufficient detail that the discussion of a few research problems is more easily understood. I define statistical data editing (SDE) as those methods that are used to edit (i.e., clean up) and impute (i.e., fill in) missing or contradictory data. The end result of SDE is data that can be used for the intended analytic purposes. These include primary purposes such as estimation of totals and subtotals for publications that are free of self-contradictory information and that do not contradict published totals in other sources. Self-contradictory information might include groups of items that do not add to desired subtotals, or totals for subgroups that exceed a known proportion of the total for the entire group. Further uses of the data after SDE might be the preparation of variances of estimates for a number of sub-domains and micro-data analyses.

If only a few published totals need to be accurate, then an efficient use of resources may be to perform detailed edits on only the few records that affect the estimated totals. If many analyses need to be performed on a large number of sub-domains, or if the full set of accurate micro-data is needed, then a very large number of edits, follow-ups, and corrections may be needed. SDE can be used in all phases of survey processing. These phases include frame development, form design, proposed analytic purposes for which the data are collected, and quality assurance. This paper focuses primarily on SDE as it applies to analytic purposes, and places most emphasis on those procedures typically applied after the initial receipt of survey or other data.
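The two kinds of self-contradictory information mentioned above (items that do not add to a subtotal, and a subgroup total that exceeds a known proportion of the group total) can be written as simple edit rules. The following is a minimal sketch and is not taken from the paper: the `Record` layout, the function names, the rounding tolerance `tol`, and the 40% share limit `max_share` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    """Hypothetical survey record, used only for this illustration."""
    components: List[float]  # reported items that should add to the subtotal
    subtotal: float          # reported subtotal for the subgroup
    group_total: float       # reported total for the entire group


def balance_edit_fails(rec: Record, tol: float = 0.5) -> bool:
    """True if the items do not add to the subtotal (within an assumed tolerance)."""
    return abs(sum(rec.components) - rec.subtotal) > tol


def ratio_edit_fails(rec: Record, max_share: float = 0.4) -> bool:
    """True if the subtotal exceeds an assumed maximum share of the group total."""
    return rec.subtotal > max_share * rec.group_total


def failed_edits(rec: Record) -> List[str]:
    """Return the names of the edits that the record fails."""
    failures = []
    if balance_edit_fails(rec):
        failures.append("balance")
    if ratio_edit_fails(rec):
        failures.append("ratio")
    return failures


if __name__ == "__main__":
    clean = Record(components=[10.0, 20.0, 30.0], subtotal=60.0, group_total=200.0)
    suspect = Record(components=[10.0, 20.0, 30.0], subtotal=65.0, group_total=100.0)
    print(failed_edits(clean))
    print(failed_edits(suspect))
```

A check of this kind only flags records that fail an edit. Deciding which fields to change and what values to impute is a separate step that this sketch does not attempt.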
The main goal of SDE might be improved procedures and greater automation to enhance the ability of survey managers and analysts to provide accurate published estimates and micro-data. I broadly subdivide statistical data editing into two subcategories: (1) Fellegi-Holt (FH) methods and systems and (2) general methods and systems. FH systems are based on the Fellegi-Holt model of editing and typically add various options for imputation. General methods are all other methods. Although the paper by Fellegi and Holt (1976) appeared quite a while ago, few systems have been implemented because of the difficulty in developing …





Publication date: 1999